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Since 1927 and until recently, most models describing the spread of disease have been of compartmental 
type, based on the assumption that populations are homogeneous and well-mixed. Recent models have 
utilised agent-based models and complex networks to explicitly study heterogeneous interaction patterns, 
but this leads to an increasing computational complexity. Compartmental models are appealing because of 
their simplicity, but their parameters, especially the transmission rate, are complex and depend on a number 
of factors, which makes it hard to predict how a change of a single environmental, demographic, or 
epidemiological factor will affect the population. Therefore, in this contribution we propose a middle 
ground, utilising crowd-behaviour research to improve compartmental models in crowded situations. We 
show how both the rate of infection as well as the walking speed depend on the local crowd density around an 
infected individual. The combined effect is that the rate of infection at a population scale has an analytically 
tractable non-linear dependency on crowd density. We model the spread of a hypothetical disease in a 
corridor and compare our new model with a typical compartmental model, which highlights the regime in 
which current models may not produce credible results. 



During its history, humanity has repeatedly encountered major epidemics such as the Plague (Black Death) 
in the 14* century, the Spanish Flu in 1918 or more recently the Swine Flu (2009), to mention only a few. 
Some of these epidemics severely affect the global population, such as the Plague that in only six years 
caused the death of almost one third of the European population. 

One of the first analytical approaches to study and model the systemic nature of the spread of an infectious 
disease, was Kermack and McKendrick^ who proposed what would later be modified and defined as an SIR model, 
that divided a population into three compartments; (i) susceptible, S; (ii) infectious, I; and (iii) recovered/ 
removed, R, and devised a set of differential equations to model the transition rates from each of these states 
to the others. This seminal work would lead to the onset of a new field of Mathematical Epidemiology. By 
introducing also an (iv) exposed class, £, the model becomes what is called an SEIR model^'^: 



S= -MS/N 
E = US/N-kE 
i = kE-ocI 



(1) 



where S + E + I + R = N (population size). 

To make this model more practical for different population sizes, we will now normalise the model according to 
s = S/N, e = E/N, i = I/N, and r = i^/iV which results in Eq. (2): 

s= — lis 



e = lis 
i = ke — oci 



he 



(2) 



r = ai 
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where k is the per-capita transmission rate, a is the constant recovery 
rate (thus, the mean infectious period is 1/a) and 1/k is the mean 
exposed period. 

The SEIR model is based on the following four main assumptions: 

1. The population size N is constant. 

2. There is no heterogeneity, i.e. the impact of individual charac- 
teristics such as age, sex and behaviour are neglected. 

3 . The introduction of an exposed class means that individuals are 
not immediately able to infect others upon contracting the 
disease themselves 

4. The population is well-mixed. This means that each individual 
has the same probability of contracting the disease. 

Therefore, such an approach will neglect the importance of het- 
erogeneity in the social interaction and mobility patterns of indivi- 
duals. The validity of the 'well-mixed' assumption is particularly 
problematic. Recent Statistical Physics approaches however, have 
incorporated such elements into disease-spreading models^'^ using 
transport networks as a proxy of human mobility, which has resulted 
in higher fidelity in predictions of disease outbreaks. 

A growing body of literature on detailed studies of crowd beha- 
viour in crowded places have revealed a number of insights into non- 
linear, dynamic, adaptive and self-organised crowd behaviour that 
seriously question the validity of the well-mixed assumption in the 
original SIR model, even if network effects are taken into account to 
achieve some degree of heterogeneity. 

Since the beginning of the century, mathematical epidemi- 
ology has found in the use of complex networks a new and more 
precise way to tackle the problem; where individuals are represented 
by nodes and their interactions by links. Moreover the nodes have a 
certain connectivity distribution i.e. the probability P(k) that a node 
is connected to k other nodes. This connectivity is typically either 
exponential (in case of exponential networks) or power-law (in case 
of scale-free networks). An important result was obtained in Ref 14 
where it as shown that exponential networks show an epidemic 
threshold that separates an infected phase from a healthy one, while 
the behaviour of scale-free networks depends on the power-law 
exponent and may or may not represent the epidemic threshold. 

Extensive research has been focused on the study of epidemics 
through network theory. However, while contemporary network 
science approaches^'^^'^^ to modelling the spread of disease has greatly 
improved the modelling accuracy at large scales, there is still limited 
understanding of the effect of crowd behaviour that come into play in 
densely crowded and confined spaces such as transport hubs, mass 
gatherings, and city centres. 

In recent years the study of human interaction and mobility has 
provided some important insights that are needed to increase the 
realism of epidemic models, mostly at large scales^^. In particular, 
sensors such as mobile phone tracking, WiFi/Bluetooth devices, and 
CCTV are used to acquire important information of how people 
move in their daily lives as well as during mass gatherings or other 
crowded events. 

On an even smaller scale, interesting experiments are being carried 
on in small and crowded environments (conference halls, hospital 
wards, museums etc) by using wearable sensors (such as RFID 
devices) that are capable of sensing face-to-face interaction of indi- 
viduals^^"^°. These studies have shown the importance of heterogen- 
eity and dynamic contact patterns in shaping the dynamics of the 
infection, by the use of simulations, temporal networks and contact 
matrices. However, we still lack an analytical instrument that allows 
us to find a common design and evaluation strategy of the problem in 
general cases. 

Therefore, in this contribution, we propose a way in which insights 
into crowd behaviour can be used to improve prevailing compart- 
mental models (well known for their simplicity). A microscopic 
foundation is particularly important to evaluate potential epidemio- 



logical implications of designs or operational plans for one-off events 
where we do not have the luxury of existing data that can be analysed 
and used as a basis for an epidemiological model. For recurring 
events, our study will provide an important analytical toolkit that 
can be used to study the epidemiological implications of changing 
one of the crowd-related design or operational parameters, and keep- 
ing everything else constant. 

By analysing a tractable scenario-people moving in a corridor- 
and building an SEI model, we focus our attention at one of the most 
elusive parameters of mathematical epidemiology: the contact rate. 
We will study how contact rate depends on crowd behaviour, and 
most importantly, the crowd density. 

Results 

We recognise that human mobility in crowded places exhibit a rich 
set of complex, adaptive and self-organised behaviours^ Therefore, 
we propose a bottom-up modelling approach to the spread of disease 
in crowded places, starting with a simple but analytically tractable 
special case, that can later be extended to take a richer set of crowd 
behaviours and more complex environments into account. 

Traditional SEIR model in a corridor. Before we introduce our new 
methodology taking crowd behaviour into account, let us start by 
implementing the traditional SEIR model in a corridor with unidi- 
rectional crowd flow. The length of the corridor is L metres, the width 
is w metres and the corridor has periodic boundary conditions (see 
Fig. 1). In this environment, 1% of the population is infected at time 
0. Adding to the previous SEIR model assumptions, we also have the 
following corridor-specific assumptions: 

5. Periodic boundary conditions (and thus, a constant population 
size N) is assumed in the corridor. 

6. The time scale (i.e. maximum simulated time) is smaller than the 
recovery time and exposed time (i.e. only the exposed and sus- 
ceptible classes will be affected within simulated time). This also 
means that we will not take secondary infections into account. 

Now, since we aim to improve rather than to replace current mod- 
els, we will put our model into prevailing terminology, and thus, each 
of the individuals in our model can take any of the states (i) suscept- 
ible, s; (ii) infectious, /; and (iii) exposed, e, which represents indivi- 
duals that have become infected after contact with an infected person 
but who are not yet infectious due to the relatively short duration of 
our simulation, for this same reason the compartment r is not pre- 
sent. We will thus obtain the normalised SEI model described by 
Eq. (3). 

s= — lis 

e = lis (3) 
i = 0 

where /q — 0-01 (and s + e + i = 1 since we do not have the 'r' 
compartment anymore). 

Improved density-dependent SEIR model. Since we want to take 
crowd behaviour explicitly into account, we will now propose an 
improved model where X will depend on the spatio-temporal 
distribution of the crowd density. 

Earlier work^^"^^ has studied distances that large droplets are car- 
ried; (i) 6 m away by exhaled air at a velocity of 50 m/s (sneezing); (ii) 
2 m away at a velocity of 10 m/s (coughing); and (iii) 1 m away at a 
velocity of 1 m/s (breathing). 

We will build on these findings and therefore replace the earlier 
assumption 4 (i.e. 'The population is well mixed) with the following: 
Infections can only happen within an i^-metre radius of an infected 
individual. The final new assumption that the new model will be 
based on is: 
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Figure 1 | Illustration of corridor with periodic boundary conditions, 
length L metres and width w metres. The small black circle shows a single 
infected individual, the larger dashed circle shows the radius R of infection, 
the smaller grey circles show individuals within the radius of possible 
infection, and the small white circles show individuals that are currently 
outside the reach of possible infection. 

7. The mixing rate (i.e. the arrival rate of new pedestrians) around 
infected individuals due to variable walking speeds is much 
higher than the rate of infection, which means that we do not 
need to take local saturation effects into account. 

In previous work^'^, the per-capita transmission rate k has been 
defined as a constant, reflecting the average number of contacts 
within a population. Such an approach can lead to accurate out- 
break predictions only if k is calibrated to specific cases, but it 
does not incorporate the effect of crowd behaviour which can 
lead to characteristically different contact rates in different scen- 
arios. We also assume that the mixing rate (i.e. the rate of arrival 
of new pedestrians) around infected individuals due to variable 
walking speeds is much higher than the rate of infection (which 
means that we do not need to take local saturation effects into 
account). 

In our case, each infected individual can only reach n other indi- 
viduals within a 1 -metre radius (see Fig. 1). 

In reality, this number n{t)is time dependent, n{t) = p{t)nR^, since 
it depends on the local density p{t) (1/m^) of the crowd surrounding 
each infected individual, multiplied by the area nR^ where an infec- 
tion can take place. 

Taking time-dependent crowd behaviour into account would 
make our model (Eq. (3)) too complicated for practical purposes. 
Therefore, we will re-formulate crowd-dynamic theory into a 
statistical description that we can later incorporate into our 
existing model (Eq. (3)). To do so, we will start with a previously 
discovered Eulerian description^ of the crowd density distri- 
bution p\ and re-formulate this into a Lagrangian description 

that will better capture the effect that pedestrians spend longer 
time in crowded areas due to their lower walking speed in such 
areas. 

We know from Ref. 8 that if a random location is picked some- 
where within an open space with area A (m^) and the average density 
Q (of the whole space) is q = N / A(m~^^, the local density p (m"^) 
can be described by a Gamma distribution, according to Eq. (4). 



/(p;A;5) = 



r(A)' 



(4) 



where A = 3^ = 3^ = / (Lw), g is the global density of the whole 
corridor, and 5 = 3. 

Moreover, if a random pedestrian / (in our case an infected indi- 
vidual) is picked at a random time U the local density p around that 
individual / will follow the distribution described by Eqs. (5) and (6). 

f{p]A-B) 1 



p\p-,A-B)- 



(v(p)> 



Pn 



{v{p))dp 



Normalisation factor 



0.6 



.9 0.5 



Spatial distribution 

Temporal distribution p^ 




Local crowd density p (m ) 



Figure 2 | Probability- density of the spatial distribution of the local 
crowd density in a space with an average crowd density 
^ = N/A = 2.5m~^asan example, compared with the (spatio-)temporal 
distribution of local density across the same space. The difference is due 
to a lower walking speed in high density areas, which will lead to 
individuals spending longer time in dense areas compared to lower density 



0.05<(v(p)) = - 



0.5 



<1.34 



(6) 



is the average velocity (1.34 m/s is the average value of the free speed 
when the corridor is empty^) and p^ax is the maximum value of the 
density. 

The probability density functions of the crowd density over space 
ip') and space-and-time (pO are illustrated in Fig. 2. The reason why 
p^ is skewed towards larger crowd density values compared to p' is 
because pedestrians walk slower when they are in dense regions 
compared to less dense regions, which means that they will spend 
more time in crowded areas compared to less crowded areas. 

Because walking speed is a decreasing function of crowd density, 
individuals spend relatively longer periods of time in crowded areas 
compared to less crowded areas. Therefore, from an individual per- 
spective, the average value of the local density around an infected 
individual is given by Eq. (7). 



Jo 



(7) 



which means that, substituting this value in the definition of the 
contact rate we obtain: 



^density-dependent ^^2^ 



p\p)pdp 



(8) 



0 



Number of pedestrians in circle 



where: 



where c is the transmissibility i.e. the fraction of infected- susceptible 
contacts that actually lead to an infection. 

In this way, we have obtained a new description of the transmis- 
sion rate between pedestrians in a crowded location, a value that 
depends on the crowd density. With the previous model definition, 
the value of the transmission rate did not depend on the crowd 
density in the corridor which is the case for the new model. 

Numerical results. Let us now compare our new density- dependent 
model with the previous model to investigate what effect a crowd- 
density- dependent per-capita infection rate ^.density-dependent ^^jj heive 
on the population scale. As can be seen in Fig. 3, the rate of infection 
per unit time is a non -linear function of crowd density. For practical 
purposes, the infection rate per distance walked is a better metric than 
the infection rate per unit time. For example in a busy transport hub. 
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Figure 3 | Dependency of per-captita infection rate X on average crowd 
density q = N /A. Note that the rate of infection per 0.1 metre walked, 
peaks at high crowd density due to the decreasing walking speed at high 
crowd density. 

people have to walk a fixed distance from one platform to another 
regardless of the level of crowdedness and will therefore spend longer 
time altogether in that transport hub on crowded days (thus, being 
even more affected by prevalent diseases in that space). 

To show the population-scale impact of our proposed model, we 
simulate a hypothetical disease that spreads between two individuals 
when they are at close proximity. We set the constant per-capita 
transmission rate to ^1'=°"^*^"^ = 0.01. If we take (p) =1.0 m"^ as an 
arbitrary crossover point between 2^°"^tant ^density-dependent^ ^ 
= 0.01/(7ii^') which will define ^density-dependent accordlng to Eq. (8). 

We now run our old and new density- dependent SEI models next 
to each other for different values of average density in the interval 
Q = iV/Ae [0,8] (m~^) . We then plot the model results for the fraction 
of exposed individuals in the population, for a population size N = 
1000, as a function of both time t, and average crowd density g (see 
Fig. 4 for results). It is clear that even though the old and the newly 
proposed model give comparable results within a narrow range of 
crowd densities, for large portions of the density spectrum, the old 
model either severely under- or over-estimates the rate of infection of 
the disease. 

Model validation. A comprehensive quantitative model validation is 
not within the scope of this paper. However, to be able to validate the 
main characteristics of our model, we will run an agent-based crowd 
model of unidirectional crowd flow in a corridor. Our crowd model is 
based on the social-force modeF^ where the motion of pedestrian / is 
described by 



dt 



=m 



(9) 



where m is the mass of the pedestrian (in kg), v is the walking velocity, 
and// is a force described by 

^ k 

where fij is a repulsive force from pedestrian; acting on pedestrian / 
specified as 



fij = Fexp{-dij/Do)eij 



11 



where dij is the distance from pedestrian / to pedestrian j, Cij is the 
normalised vector pointing from the centre of mass of pedestrian) to 
the centre of mass of pedestrian /. 




Average crowd density (m 



0 0 



Time (hours) 



Figure 4 | Numerical results showing how the fraction of individuals in 
the exposed state e depends on time t and average crowd density g = N/A. 
The two surfaces correspond to a traditional compartmental model 
(referred to as 'old SEI model' and the model we propose in this paper 
referred to as 'density-dependent SEI model'). 

The force fik from boundary k onto pedestrian / is also specified by 
Eq. (11), but the position of pedestrian; is replaced by the closest 
point on boundary k. 

The model parameters used are the same as in Ref. 21. Note that 
our model does not use the second force component {Di = dijY used 
to accurately model turbulent and very dense crowds, since that is not 
the focus of our paper. The free speeds v° are Gaussian distributed 
with mean 1.34 m/s and standard deviation 0.26 m/s when pedes- 
trians are unobstructed. However, speeds are also bounded by 

Kdf^'it)/!! where df^'it) is the minimum net-distance to all 
surrounding pedestrians j and T/ is the preferred net-time headway 
that are Gaussian distributed with mean 0.5 s and standard deviation 
0.1 s. 

The mass m of pedestrians are Gaussian distributed with mean 
60 kg and standard deviation 10 kg; the maximum pair- wise repuls- 
ive force is set to F = 160 N; the relaxation time is set to t = 0.5 s; and 
the remaining parameter Dq that determines the typical length scale 
of the repulsive forces is set to Dq = 0.31 m. 

The results of our agent-based model (see Fig. 5) matches the main 
characteristics of our model results as shown in Fig. 3. Notably, the 
number of infections per metres walked peak at an average crowd 
density above 6 m~^. This highly non-linear behaviour which is the 
result of many interactive bodies would normally need a microscopic 
treatment. However, as we have shown here, the outcome of this 
non-linear behaviour can be accurately accounted for by our mac- 
roscopic equations where the only input needed is the average crowd 
density. This opens up new application areas for compartmental 
models where they have been less suitable in the past, such as envir- 
onments containing large and dense crowds. A more comprehensive 
model validation will be carried out in future work, to empirically test 
the accuracy of our model predictions in dense and confined spaces, 
when such data becomes available. 

Discussion and future work 

Our study shows a way in which detailed insights gained from data- 
driven crowd research can be utilised to improve current disease 
spreading models in an analytically tractable way. Thus, the spread 
of disease depends strongly on the behaviour of the crowd, and in 
particular the contact rate is a key parameter in the study of the 
evolution of a disease and it varies considerably depending on the 
density of the population in the studied environment. 
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Methods 

Figure 2 was produced by numerically solving Eqs. (4)-(6) for p on the interval [0.01, 
8] with a step size of 0.01. 

Note that the average walking speed {v{p)) is truncated at an upper boundary at 
1.34 m/s (i.e. the 'free' unobstructed walking speed), and by a lower boundary at 0.01 
(m/s). All intermediate walking speeds are calculated using the non-linear function 
specified by Eq. (6). 

Figure 3 was produced by solving Eq. (8) for 100 values of p equally distributed on 
the interval [0.01, 8]. 

Figure 4 was created by solving Eq. (3) for a population size N = 1000, using our 
proposed density- dependent specification of 1 obtained from Eq. (8) and comparing 
the results to the number of exposed individuals produced using a constant value 
^^constant^ The range of crowd density is exactly the same as the one used in Fig. 3. The 
time axis uses a time step of 60 seconds and the time range used is [0, 6 X 3600] 
seconds (i.e. six hours). 

Figure 5 was obtained by repeatedly running an agent-based crowd model 10 times 
for each of 40 different values of crowd density in the range [0, 8] m"^ using N = 200 
pedestrians in a corridor with periodic boundary conditions. The model ran for 10 
simulated seconds and data from t<2 s was excluded because of the time needed for 
pedestrians to accelerate from their initial zero speeds at the start of the simulation. 
The model equations (9)-(ll) were solved numerically using the 1st order Euler 
method with a time step of 0.05 s. 
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Figure 5 | Top: Results of the agent-based model (ABM) of a 
unidirectional crowd flow in a corridor. The curves correspond to the 
mean values of 10 simulations for a number of average crowd densities, and 
the error bars correspond to 1 standard deviation. The blue dashed curve is 
the same as in Fig. 3 for comparison. Bottom: Snapshot of the agent-based 
crowd model. Colours correspond to: susceptible (yellow); infected (red); 
and exposed (blue). 

We have found that three different but related effects of crowd 
behaviour affect the rate of infection in a confined space. All three 
effects contribute to an increased infection rate for increasing crowd 
density; (i) increasing crowd density leads to decreased proximity, 
but this leads to a higher number of individuals within the range of 
infection around an infected individual; (ii) density- dependent walk- 
ing speeds lead to individuals spending longer periods of their time in 
crowded locations compared to less crowded locations within an area 
with locally varying crowd densities (thus, the mean crowd density 
underestimates the typical proximities between individuals); and (iii) 
for the same reason, individuals also spend longer time altogether in 
environments such as transport hubs on crowded days compared to 
less crowded days, which leads to an increasing rate of infection per 
distance walked. 

Comparing our new model with existing models, we have shown 
how existing models may give significant over- or under-estimations of 
the spread of disease in certain settings, and we have also shown how 
current models can be improved by a modification of the k parameter. 

Moreover, the use of a generic definition of the transmission rate 
that depends on the transmissibility of the disease, the area considered 
and the density of the population, is a powerful tool that can be used 
in many different crowd scenarios, and will decrease the dependency 
on a-posteriori empirical transmission rate data every time one of the 
environmental or crowd-related parameters have changed. 

Having a more precise prediction of what may, or may not, happen 
in crowded environments in terms of contagion between individuals 
could be highly useful to be able to devise adaptive control strategies 
during mass gatherings, in busy transport hubs, or in busy city centres. 
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